Data In, Fact Out: Automated Monitoring of Facts by FactWatcher

Authors

  • Naeemul Hassan
  • Afroza Sultana
  • You Wu
  • Gensheng Zhang
  • Chengkai Li
  • Jun Yang
  • Cong Yu
Abstract

Towards computational journalism, we present FactWatcher, a system that helps journalists identify data-backed, attention-seizing facts which serve as leads to news stories. FactWatcher discovers three types of facts (situational facts, one-of-the-few facts, and prominent streaks) through a unified suite of data model, algorithm framework, and fact ranking measure. Given an append-only database, upon the arrival of a new tuple, FactWatcher monitors whether the tuple triggers any new facts. Its algorithms efficiently search for facts without exhaustively testing all possible ones. Furthermore, FactWatcher provides multiple features in striving for an end-to-end system, including fact ranking, fact-to-statement translation, and keyword-based fact search.

1. MOTIVATION

Computational journalism [1, 2] is a young field that assists journalism using computing. One of its objectives is to find news leads backed up by hard, factual data. In the last several years, our research in this thrust has been focused on automatic and algorithmic fact finding by database and data mining techniques [3, 5, 4, 6]. Specifically, we studied how to monitor three types of facts that can be expressed as the following factual statements:

Situational fact [4]: “The social world’s most viral photo ever generated 3.5 million likes, 170,000 comments and 460,000 shares by Wednesday afternoon.” (http://www.cnbc.com/id/49728455) A situational fact is about a contextual skyline object within a certain context (e.g., all photos posted to Facebook) with regard to several measures (e.g., number of “likes”, number of “comments”, and number of “shares”), i.e., the object is not dominated by any object in the context when they are compared by the measures.

One-of-the-few [5]: “Victor Oladipo scored 30 points and handed out 14 assists ... only three other rookies have recorded at least 30 points and 14 assists in a game ...” (http://espn.go.com/espn/elias?date=20140222) This statement is about a one-of-the-four object, which is dominated by at most three other objects.

Prominent streak [3, 6]: “This month the Chinese capital has experienced 10 days with a maximum temperature in around 35 degrees Celsius - the most for the month of July in a decade.” (http://www.chinadaily.com.cn/china/2010-07/27/content_11055675.htm) A prominent streak is a long consecutive subsequence (e.g., 10 days of

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/. Obtain permission prior to any use beyond those covered by the license. Contact copyright holder by emailing [email protected]. Articles from this volume were invited to present their results at the 40th International Conference on Very Large Data Bases, September 1st - 5th 2014, Hangzhou, China. Proceedings of the VLDB Endowment, Vol. 7, No. 13. Copyright 2014 VLDB Endowment 2150-8097/14/08.

[Figure labels recovered from the extracted text: Algorithms; Facts; Situational Facts (<*,*,DAL>, {pts, reb}); One-of-the-Few (, {pts, reb}); Prominent Streaks (, {pts}). Sample data table from the figure:]

id   player  team      ...  pts  reb  ...
t1   Larmar  Clippers  ...  12   9    ...
t2   Larmar  Clippers  ...  8    11   ...
...  ...     ...       ...  ...  ...  ...
t7   House   Heat      ...  8    6    ...
t8   Larmar  Clippers  ...  10   11   ...
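The dominance test behind situational facts and one-of-the-few facts can be illustrated with a minimal sketch. It is not FactWatcher's algorithm: the abstract says the system avoids exhaustive testing, whereas this sketch compares a new tuple against every earlier tuple in one given context. The function names, the dictionary-based row layout, the threshold `k`, and the assumption that larger measure values are better are all hypothetical choices made for illustration. The rows are the ones visible in the figure's sample table.

```python
from typing import List, Mapping, Optional, Sequence, Union

Row = Mapping[str, Union[str, int]]


def dominates(a: Row, b: Row, measures: Sequence[str]) -> bool:
    """a dominates b: no worse on every measure, strictly better on one.

    Assumes larger values are better for every measure.
    """
    no_worse = all(a[m] >= b[m] for m in measures)
    strictly_better = any(a[m] > b[m] for m in measures)
    return no_worse and strictly_better


def in_context(row: Row, constraint: Mapping[str, str]) -> bool:
    """A row is in the context if it matches every constrained attribute."""
    return all(row[attr] == value for attr, value in constraint.items())


def dominator_count(new_row: Row, history: Sequence[Row],
                    constraint: Mapping[str, str],
                    measures: Sequence[str]) -> int:
    """Count earlier tuples in the context that dominate new_row (brute force)."""
    if not in_context(new_row, constraint):
        raise ValueError("new_row does not satisfy the constraint")
    return sum(
        1 for old in history
        if in_context(old, constraint) and dominates(old, new_row, measures)
    )


def describe(new_row: Row, history: Sequence[Row],
             constraint: Mapping[str, str], measures: Sequence[str],
             k: int = 3) -> Optional[str]:
    """Label new_row by how many tuples dominate it; k is a hypothetical cutoff."""
    n = dominator_count(new_row, history, constraint, measures)
    if n == 0:
        return "situational fact"      # contextual skyline object
    if n <= k:
        return f"one-of-the-{n + 1}"   # dominated by only n others
    return None


# Rows visible in the figure's sample table (elided rows are omitted).
history: List[Row] = [
    {"id": "t1", "player": "Larmar", "team": "Clippers", "pts": 12, "reb": 9},
    {"id": "t2", "player": "Larmar", "team": "Clippers", "pts": 8, "reb": 11},
    {"id": "t7", "player": "House", "team": "Heat", "pts": 8, "reb": 6},
]
t8: Row = {"id": "t8", "player": "Larmar", "team": "Clippers", "pts": 10, "reb": 11}

print(describe(t8, history, {"player": "Larmar"}, ["pts", "reb"]))
```

With only these visible rows, no earlier Larmar tuple is at least as good as t8 on both points and rebounds, so t8 is a skyline object in the context player = Larmar under the measures {pts, reb}.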


Similar resources

Detection of children's activities in smart home based on deep learning approach

 Monitoring children's behavior at home is extremely important for avoiding possible injuries. Researchers have therefore considered automated systems for monitoring children's behavior. The first step in designing and implementing such a system for closed spaces is recognizing children's activities using the sensors in the e...

Face Detection with methods based on color by using Artificial Neural Network

Face detection methods are used to provide security. Their main problem is that faces are hard to categorize because of the great differences and variety among individuals' faces. This paper presents a face detection method based on skin color data to overcome this problem. The researcher gathered a face database of 30 individuals consisting of ov...


Non-ionizing electromagnetic waves measurement and monitoring Systems

In this paper, portable and online monitoring systems for non-ionizing EMF measurement, as well as some measurement results in Tehran are discussed. First of all, standards and recommendations related to EMF monitoring, measurement and exposure such as ISIRI No.8567, ICNIRP guideline and ITU-K.83 are reviewed. Then, technical facts and operation of portable device and monitoring station are int...


Fact-Checking of Reports in Kalamat-e Anjoman Using Archival Documents on the History of Kashan during the Qajar Era

This research aims to do a content review on the materials printed and published in “Kalamat-e Anjoman” about the history of Kashan. This study also assesses these materials using archival documents in order to confirm or refute the contents. This research used a descriptive/analytical method and the data were obtained from “Kalamat-e Anjoman” and archival documents. Findings show that Abdolras...



Journal:
  • PVLDB

Volume 7  Issue 

Pages  -

Publication date 2014